Towards Weakly-Supervised Action Localization

Authors

Philippe Weinzaepfel

Xavier Martin

Cordelia Schmid

Abstract

This paper presents a novel approach for weakly-supervised action localization, i.e., one that does not require per-frame spatial annotations for training. We first introduce an effective method for extracting human tubes by combining a state-of-the-art human detector with a tracking-by-detection approach. Our tube extraction leverages the large amount of annotated humans available today and outperforms the state of the art by an order of magnitude: with fewer than 5 tubes per video, we obtain a recall of 95% on the UCF-Sports and J-HMDB datasets. Given these human tubes, we perform weakly-supervised selection based on multi-fold Multiple Instance Learning (MIL) with improved dense trajectories and achieve excellent results. We obtain a mAP of 84% on UCF-Sports, 54% on J-HMDB and 45% on UCF-101, which outperforms the state of the art for weakly-supervised action localization and is close to the performance of the best fully-supervised approaches. The second contribution of this paper is a new realistic dataset for action localization, named DALY (Daily Action Localization in YouTube). It contains high-quality temporal and spatial annotations for 10 actions in 31 hours of videos (3.3M frames), which is an order of magnitude larger than standard action localization datasets. On the DALY dataset, our tubes have a spatial recall of 82%, but the detection task is extremely challenging: we obtain 10.8% mAP.
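To make the tube recall figures quoted in the abstract more concrete, the following is a minimal sketch of how recall of candidate human tubes against annotated tubes could be measured. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not give the matching criterion, so the sketch assumes a tube is a mapping from frame index to box, that overlap is the per-frame IoU averaged over the annotated frames, and that a ground-truth tube counts as recalled when some candidate reaches a threshold of 0.5. The names `box_iou`, `tube_iou` and `recall` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout
Tube = Dict[int, Box]                    # frame index -> box, assumed layout


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(gt: Tube, cand: Tube) -> float:
    """Mean per-frame IoU over the ground-truth frames.

    Frames where the candidate has no box contribute 0 (an assumption).
    """
    if not gt:
        return 0.0
    total = sum(box_iou(box, cand[f]) for f, box in gt.items() if f in cand)
    return total / len(gt)


def recall(gt_tubes: List[Tube], candidates: List[Tube], thr: float = 0.5) -> float:
    """Fraction of ground-truth tubes matched by at least one candidate."""
    if not gt_tubes:
        return 0.0
    hits = sum(
        any(tube_iou(gt, c) >= thr for c in candidates) for gt in gt_tubes
    )
    return hits / len(gt_tubes)


# Toy usage: one annotated tube, one good and one bad candidate.
gt = {0: (0.0, 0.0, 10.0, 10.0), 1: (0.0, 0.0, 10.0, 10.0)}
good = {0: (0.0, 0.0, 10.0, 10.0), 1: (1.0, 1.0, 11.0, 11.0)}
bad = {0: (20.0, 20.0, 30.0, 30.0)}
print(recall([gt], [bad, good]))  # 1.0
print(recall([gt], [bad]))        # 0.0
```

With only a handful of candidates per video, a measure of this kind rewards tube extractors that place at least one well-aligned track on each annotated person, which is the regime the abstract describes.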

Similar resources

Action Recognition by Weakly-Supervised Discriminative Region Localization

We present a novel probabilistic model for recognizing actions by identifying and extracting information from discriminative regions in videos. The model is trained in a weakly-supervised manner: training videos are annotated only with a training label, without any action location information within the video. Additionally, we eliminate the need for any pre-processing measures to help shortlist ca...

Self-Transfer Learning for Fully Weakly Supervised Object Localization

Recent advances in deep learning have achieved remarkable performance in various challenging computer vision tasks. Especially in object localization, deep convolutional neural networks outperform traditional approaches based on extraction of data/task-driven features instead of handcrafted features. Although location information of regions-of-interest (ROIs) gives a good prior for object localiz...

Guess Where? Actor-Supervision for Spatiotemporal Action Localization

This paper addresses the problem of spatiotemporal localization of actions in videos. Compared to leading approaches, which all learn to localize based on carefully annotated boxes on training video frames, we adhere to a weakly-supervised solution that only requires a video class label. We introduce an actor-supervised architecture that exploits the inherent compositionality of actions in term...

Weakly supervised learning from images and videos

With the amount of on-line available digital content growing daily, large-scale, weakly supervised learning is becoming more and more important. In this talk we present some recent results for weakly supervised learning from images and videos. Standard approaches to object category localization require bounding box annotations of object instances. This time-consuming annotation process is sides...

C-WSL: Count-guided Weakly Supervised Localization

We introduce a count-guided weakly supervised localization (C-WSL) framework with per-class object count as an additional form of image-level supervision to improve weakly supervised localization (WSL). C-WSL uses a simple count-based region selection algorithm to select high-quality regions, each of which covers a single object instance at training time, and improves WSL by training with the se...

Journal title:

CoRR

Volume: abs/1605.05197

Publication date: 2016
